115 research outputs found

    The Fisher-Rao metric for projective transformations of the line

    A conditional probability density function is defined for measurements arising from a projective transformation of the line. The conditional density is a member of a parameterised family of densities in which the parameter takes values in the three-dimensional manifold of projective transformations of the line. The Fisher information of the family defines on the manifold a Riemannian metric known as the Fisher-Rao metric. The Fisher-Rao metric has an approximation which is accurate if the variance of the measurement errors is small. It is shown that the manifold of parameter values has a finite volume under the approximating metric. These results are the basis of a simple algorithm for detecting those projective transformations of the line which are compatible with a given set of measurements. The algorithm searches a finite list of representative parameter values for those values compatible with the measurements. Experiments with the algorithm suggest that it can detect a projective transformation of the line even when the correspondences between the components of the measurements in the domain and the range of the projective transformation are unknown.
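
As a minimal illustration of why the parameter manifold above is three-dimensional, the sketch below writes a projective transformation of the line as x -> (a*x + b) / (c*x + d). The function name and the tuple layout (a, b, c, d) are assumptions made here for illustration and are not taken from the paper. Four numbers, defined only up to a common nonzero scale, leave three degrees of freedom.

```python
def projective_map(params, x):
    """Apply x -> (a*x + b) / (c*x + d) to a point x on the line.

    params is a hypothetical (a, b, c, d) tuple with a*d - b*c != 0.
    """
    a, b, c, d = params
    if a * d - b * c == 0:
        raise ValueError("degenerate transformation: a*d - b*c must be nonzero")
    denom = c * x + d
    if denom == 0:
        raise ZeroDivisionError("x is mapped to the point at infinity")
    return (a * x + b) / denom


# Scaling all four parameters by the same nonzero factor gives the same map,
# so only three of the four parameters are independent.
original = (2.0, 1.0, 0.0, 1.0)
scaled = tuple(2.0 * p for p in original)
same_image = projective_map(original, 3.0) == projective_map(scaled, 3.0)
```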

    A Fisher-Rao metric for paracatadioptric images of lines

    In a central paracatadioptric imaging system a perspective camera takes an image of a scene reflected in a paraboloidal mirror. A 360° field of view is obtained, but the image is severely distorted. In particular, straight lines in the scene project to circles in the image. These distortions make it difficult to detect projected lines using standard image processing algorithms. The distortions are removed using a Fisher-Rao metric which is defined on the space of projected lines in the paracatadioptric image. The space of projected lines is divided into subsets such that on each subset the Fisher-Rao metric is closely approximated by the Euclidean metric. Each subset is sampled at the vertices of a square grid and values are assigned to the sampled points using an adaptation of the trace transform. The result is a set of digital images to which standard image processing algorithms can be applied. The effectiveness of this approach to line detection is illustrated using two algorithms, both of which are based on the Sobel edge operator. The task of line detection is reduced to the task of finding isolated peaks in a Sobel image. An experimental comparison is made between these two algorithms and a third algorithm taken from the literature and based on the Hough transform.
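
For readers unfamiliar with the Sobel edge operator mentioned above, here is a minimal sketch of the standard 3×3 Sobel gradient magnitude on a small grayscale image stored as a list of rows. It illustrates the generic operator only, not the two detection algorithms of the paper; the function name is hypothetical and border pixels are simply left at zero.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Return the Sobel gradient magnitude of a 2D list of numbers."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(SOBEL_X[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            out[r][c] = math.hypot(gx, gy)
    return out


# A vertical step edge: the response is large next to the step.
step = [[0, 0, 1, 1] for _ in range(4)]
edges = sobel_magnitude(step)
```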

    Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle

    The internal calibration of a pinhole camera is given by five parameters that are combined into an upper-triangular 3×3 calibration matrix. If the skew parameter is zero and the aspect ratio is equal to one, then the camera is said to have Euclidean image plane. In this paper, we propose a non-iterative self-calibration algorithm for a camera with Euclidean image plane in case the remaining three internal parameters (the focal length and the principal point coordinates) are fixed but unknown. The algorithm requires a set of N ≥ 7 point correspondences in two views and also the measured relative rotation angle between the views. We show that the problem generically has six solutions (including complex ones). The algorithm has been implemented and tested both on synthetic data and on a publicly available real dataset. The experiments demonstrate that the method is correct, numerically stable and robust. Comment: 13 pages, 7 EPS figures.
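
The calibration matrix described above can be written out explicitly. The sketch below builds the upper-triangular 3×3 matrix for a camera with Euclidean image plane (zero skew, unit aspect ratio), leaving only the focal length f and the principal point (u0, v0). The function names and the sample values are assumptions for illustration; the self-calibration algorithm itself is not reproduced here.

```python
def calibration_matrix(f, u0, v0):
    """Upper-triangular calibration matrix with zero skew and unit aspect ratio."""
    return [[f, 0.0, u0],
            [0.0, f, v0],
            [0.0, 0.0, 1.0]]


def project(K, point):
    """Project a 3D point given in camera coordinates to pixel coordinates."""
    X, Y, Z = point
    x = [K[i][0] * X + K[i][1] * Y + K[i][2] * Z for i in range(3)]
    return (x[0] / x[2], x[1] / x[2])


# Hypothetical intrinsics: f = 800 pixels, principal point at (320, 240).
K = calibration_matrix(800.0, 320.0, 240.0)
center = project(K, (0.0, 0.0, 1.0))   # a point on the optical axis
offset = project(K, (1.0, 0.0, 2.0))
```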

    Robust 3D face landmark localization based on local coordinate coding

    In the 3D facial animation and synthesis community, input faces are usually required to be labeled by a set of landmarks for parameterization. Because of the variations in pose, expression and resolution, automatic 3D face landmark localization remains a challenge. In this paper, a novel landmark localization approach is presented. The approach is based on local coordinate coding (LCC) and consists of two stages. In the first stage, we perform nose detection, relying on the fact that the nose shape is usually invariant under the variations in the pose, expression, and resolution. Then, we use the iterative closest points algorithm to find a 3D affine transformation that aligns the input face to a reference face. In the second stage, we perform resampling to build correspondences between the input 3D face and the training faces. Then, an LCC-based localization algorithm is proposed to obtain the positions of the landmarks in the input face. Experimental results show that the proposed method is comparable to state-of-the-art methods in terms of its robustness, flexibility, and accuracy.

    How to Compute the Pose of an Object Without a Direct View?

    We consider the task of computing the pose of an object relative to a camera, for the case where the camera has no direct view of the object. This problem was encountered in work on vision-based inspection of specular or shiny surfaces, which is often based on analyzing images of calibration grids or other objects reflected in such a surface. A natural setup thus consists of a camera and a calibration grid, put side by side, i.e. without the camera having a direct view of the grid. A straightforward idea for computing the pose is to place planar mirrors such that the camera sees the calibration grid's reflection. In this paper, we consider this idea, describe geometrical properties of the setup and propose a practical algorithm for the pose computation. inria-00387128, version 1 - 24 May 2009.
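
The basic geometric fact behind the mirror setup above is that a planar mirror maps each 3D point to its reflection across the mirror plane. The following is a minimal sketch of that reflection only, assuming the plane is given as normal · x = d; the function name and plane parameterisation are chosen here for illustration and the pose algorithm of the paper is not reproduced.

```python
import math


def reflect_point(point, normal, d):
    """Reflect a 3D point across the plane {x : normal . x = d}."""
    norm = math.sqrt(sum(n * n for n in normal))
    if norm == 0:
        raise ValueError("plane normal must be nonzero")
    # Rescale so that the normal has unit length.
    n = [c / norm for c in normal]
    d_unit = d / norm
    # Signed distance from the point to the plane.
    signed = sum(ni * pi for ni, pi in zip(n, point)) - d_unit
    return tuple(pi - 2.0 * signed * ni for pi, ni in zip(point, n))


# The virtual image of (1, 2, 3) seen in a mirror lying in the plane z = 1.
virtual = reflect_point((1.0, 2.0, 3.0), (0.0, 0.0, 2.0), 2.0)
# Reflecting twice returns the original point.
back = reflect_point(virtual, (0.0, 0.0, 2.0), 2.0)
```

Reflecting twice across the same plane is the identity, which is a convenient sanity check for any implementation.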

    Horror image recognition based on context-aware multi-instance learning

    Horror content sharing on the Web is a growing phenomenon that can interfere with our daily life and affect the mental health of those involved. As an important form of expression, horror images have their own characteristics that can evoke extreme emotions. In this paper, we present a novel context-aware multi-instance learning (CMIL) algorithm for horror image recognition. The CMIL algorithm identifies horror images and picks out the regions that cause the sensation of horror in these horror images. It obtains contextual cues among adjacent regions in an image using a random walk on a contextual graph. Borrowing the strength of the Fuzzy Support Vector Machine (FSVM), we define a heuristic optimization procedure based on the FSVM to search for the optimal classifier for the CMIL. To improve the initialization of the CMIL, we propose a novel visual saliency model based on tensor analysis. The average saliency value of each segmented region is set as its initial fuzzy membership in the CMIL. The advantage of the tensor-based visual saliency model is that it not only adaptively selects features, but also dynamically determines fusion weights for saliency value combination from different feature subspaces. The effectiveness of the proposed CMIL model is demonstrated by its use in horror image recognition on two large-scale image sets collected from the Internet.

    Learning human actions by combining global dynamics and local appearance

    In this paper, we address the problem of human action recognition by combining global temporal dynamics and local visual spatio-temporal appearance features. For this purpose, in the global temporal dimension, we propose to model the motion dynamics with robust linear dynamical systems (LDSs) and use the model parameters as motion descriptors. Since LDSs live in a non-Euclidean space and the descriptors are in non-vector form, we propose a shift-invariant distance based on subspace angles to measure the similarity between LDSs. In the local visual dimension, we construct curved spatio-temporal cuboids along the trajectories of densely sampled feature points and describe them using histograms of oriented gradients (HOG). The distance between motion sequences is computed with the Chi-Squared histogram distance in the bag-of-words framework. Finally, we perform classification using the maximum margin distance learning method by combining the global dynamic distances and the local visual distances. We evaluate our approach for action recognition on five short-clip data sets, namely Weizmann, KTH, UCF sports, Hollywood2 and UCF50, as well as three long continuous data sets, namely VIRAT, ADL and CRIM13. We show competitive results as compared with current state-of-the-art methods.
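
The Chi-Squared histogram distance mentioned above has a short closed form. The sketch below uses one common convention, 0.5 * sum((a - b)^2 / (a + b)) over bins; the 0.5 factor and the skipping of empty bins are assumptions, since the abstract does not say which variant the authors use.

```python
def chi_squared_distance(h1, h2):
    """Chi-squared distance between two non-negative histograms of equal length."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h1, h2):
        s = a + b
        if s > 0:  # bins that are empty in both histograms contribute nothing
            total += (a - b) ** 2 / s
    return 0.5 * total


# Identical histograms are at distance 0; disjoint normalised ones at distance 1.
same = chi_squared_distance([0.5, 0.5], [0.5, 0.5])
disjoint = chi_squared_distance([1.0, 0.0], [0.0, 1.0])
```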

    Principal axis-based correspondence between multiple cameras for people tracking

    Bin ratio-based histogram distances and their application to image classification

    Large variations in image background may cause partial matching and normalization problems for histogram-based representations, i.e., the histograms of the same category may have bins which are significantly different, and normalization may produce large changes in the differences between corresponding bins. In this paper, we deal with this problem by using the ratios between bin values of histograms, rather than bin values' differences which are used in the traditional histogram distances. We propose a bin ratio-based histogram distance (BRD), which is an intra-cross-bin distance, in contrast with previous bin-to-bin distances and cross-bin distances. The BRD is robust to partial matching and histogram normalization, and captures correlations between bins with only a linear computational complexity. We combine the BRD with the ℓ1 histogram distance and the χ2 histogram distance to generate the ℓ1 BRD and the χ2 BRD, respectively. These combinations exploit and benefit from the robustness of the BRD under partial matching and the robustness of the ℓ1 and χ2 distances to small noise. We propose a method for assessing the robustness of histogram distances to partial matching. The BRDs and logistic regression-based histogram fusion are applied to image classification. The experimental results on synthetic data sets show the robustness of the BRDs to partial matching, and the experiments on seven benchmark data sets demonstrate promising results of the BRDs for image classification

    A Fisher-Rao Metric for curves using the information in edges

    Two curves which are close together in an image are indistinguishable given a measurement, in that there is no compelling reason to associate the measurement with one curve rather than the other. This observation is made quantitative using the parametric version of the Fisher-Rao metric. A probability density function for a measurement conditional on a curve is constructed. The distance between two curves is then defined to be the Fisher-Rao distance between the two conditional pdfs. A tractable approximation to the Fisher-Rao metric is obtained for the case in which the measurements are compound in that they consist of a point x and an angle α which specifies the direction of an edge at x. If the curves are circles or straight lines, then the approximating metric is generalized to take account of inlying and outlying measurements. An estimate is made of the number of measurements required for the accurate location of a circle in the presence of outliers. A Bayesian algorithm for circle detection is defined. The prior density for the algorithm is obtained from the Fisher-Rao metric. The algorithm is tested on images from the CASIA Iris Interval database